ABSTRACT

This study investigates how measured problem-solving
abilities differ by students' mathematics curriculum, traditional or National
Council of Teachers of Mathematics (NCTM) Standards-oriented, when a
non-paper-and-pencil test is used as the source of measurement, and whether the
proposed rubric is efficient and effective for scoring the non-paper-and-pencil
test. Results indicate that it can be very efficient to score a
non-paper-and-pencil test. The rubric is provided as the appendix. (KHR)
NCTM-oriented versus Traditional Problem-solving Skills

Problem solving is the process of finding a solution path when the path is not
obvious. This study is an extension of a previous problem-solving study (author, 2000).
The research questions for the original study, which centered on the testing of
problem solving, follow.

1. What is the nature and extent of mathematical problem-solving ability that is 
measured by (a) a general problem-solving test in multiple-choice format and 
(b) a curriculum-based problem-solving test in multiple-choice format? 

2. What is the nature and extent of mathematical problem-solving ability that is 
measured by “equivalent,” constructed-response versions of the two tests? 

3. How does the measured problem-solving ability differ by test format, that is, 
multiple choice versus constructed response? 

4. How does the measured problem-solving ability differ by the students’ 
mathematics curriculum, that is, traditional algebra or National Council of 
Teachers of Mathematics Standards-oriented integrated mathematics? 

The NCTM-oriented curriculum used was the Core-Plus curriculum (CPMP; Coxford,
Fey, Hirsch, Schoen, Burrill, Hart, & Watkins, 1997). To answer the questions,
four tests were administered to approximately 550 ninth-graders. Test Q: Ability to Do
Quantitative Thinking, a subtest from the Iowa Tests of Educational Development (Feldt,
Forsyth, Ansley, & Alnot, 1993), was used in regular multiple-choice format and as a
parallel form with open options. Another test was constructed to test the placement of
students in CPMP. This test was also administered in multiple-choice and open-ended
format. In addition, various questionnaires were administered to determine students’
opportunity to learn, to describe the classroom environment, and to obtain an expert
analysis of the content of the tests.

Results for the various questionnaires indicated that students have had the 
opportunity to learn the needed skills. The classroom environments were quite similar. 
Generally, the experts did not give the tests strong endorsements as measures of problem 
solving. 

A 2x2x2 repeated measures ANOVA was used to determine significant 
differences. There were no significant differences between the two types of tests (p-value 
= .23). There were significant differences in the ability of the open-ended tests versus the 
parallel multiple-choice tests to measure problem solving as defined by the measures 
used (p-value = .01), although these differences were likely due to the scoring process of 
the tests rather than the formats, per se. There were no significant differences between the 
Core-Plus schools and the traditional schools (p-value = .77). There were no significant 
two- or three-way interactions (p-values ranged from .66 to .93). 

Some mathematics educators suggest that alternative testing formats would yield
different results (Cooney, Bell, Fisher-Cauble, & Sanchez, 1996; Hancock, 1995; Mayer
& Hillman, 1996; Schoenfeld, 1992). So where the previous study looked at written
problem-solving performance, this study engages triads of students from each curriculum
in collaborative think-aloud problem-solving sessions. Alternative testing formats are
difficult to score. In addition to giving the results for students in two
different curricula, this study therefore reports on a scoring process for such alternative testing.

In summary, this study continues the previous study with the goal of detecting and
describing how students from the two curricula (CPMP and traditional) might differ in their
approaches to problem solving, and extends the previous study with the additional goal of
describing how this type of assessment might be scored.

The research questions follow. 

1. How do the measured problem-solving abilities differ by students’ 
mathematics curriculum, traditional or NCTM Standards-oriented, when a 
non-paper-and-pencil test is used as the source of measurement? 

2. Is the proposed rubric efficient and effective for scoring the non-paper-and- 
pencil test? 

Students 

Twelve CPMP students and 24 traditional students enrolled in a large midwestern
school district were involved in the study. The numbers were limited by the need for
written parental permission. The school district serves a large urban city whose primary
industries are retail, health and education. There are 27 schools in the district with
approximately 700 teachers and a student-to-teacher ratio of 20 to 1. The city has three
hospitals, a public and a private university and a community college. There are both
private and public elementary and secondary schools. The public schools serve 75% of
the area’s students. The schools used in this study were public. In the given school
district, all CPMP students in the two high schools (two different teachers) where the CPMP
curriculum was used were invited into the study. This is a total of approximately 60
students. Of these 60, 12 returned their permission slips, and all 12 were used in the
study. Although it is possible that there is something unique about these 12 students
(since they returned their permission slips), the involved teachers did not rate these 12 as
unique in any manner, but rather as representative of their students in CPMP.
Approximately 90 traditional students were invited into the study. The 90 students were
those enrolled in traditional courses taught by (and thus in the same school as)
one of the two teachers who taught the CPMP courses. Of these 90, 24 gave permission,
and these 24 were used. Again, the involved teachers did not rate these 24 as exceptional
students. If there is something unique about students whose parents return permission
slips, then it most likely applied to both groups. It is difficult to ascertain why parents
tended not to return the permission slips. However, there was no incentive for parents to do
so.

For involved students, scores on the end-of-eighth grade Metropolitan
Achievement Test (MAT) math concepts and problem-solving subtest and separately on
the math procedures subtest were compared. See Table 1. The CPMP students had a
normal curve equivalent (NCE) score of 50.7 (standard deviation of 18.1) on the math
concepts and problem-solving subtest and an NCE score of 34.5 (standard deviation of 17.7) on
the math procedures subtest, while the traditional students had an NCE score of 60.6
(standard deviation of 17.9) on the math concepts and problem-solving subtest and an NCE score
of 46.3 (standard deviation of 15.7) on the math procedures subtest. There were no
significant differences between the two groups of students (traditional N = 24, CPMP N = 12;
p = .25 on the problem-solving subtest, p = .16 on the procedures subtest). Although the numbers involved
were small, the two groups of students in this study were fairly well matched at the
beginning of ninth-grade according to teachers’ evaluations, students’ previous
background, and scores on the MAT. During ninth-grade the students were exposed only
to the CPMP or the traditional curriculum, respectively. Again, the students were enrolled in one
of two schools, with one teacher per school involved.

Subtest                             Traditional (N = 24)       CPMP (N = 12)              p-value
Math concepts and problem solving   NCE = 60.6, s.d. = 17.9    NCE = 50.7, s.d. = 18.1    .25
Math procedures                     NCE = 46.3, s.d. = 15.7    NCE = 34.5, s.d. = 17.7    .16

Table 1: Results on the end-of-eighth grade math tests



Method 

At the end of ninth-grade, one researcher administered orally a problem-solving 
test to the students who were placed into groups of three, with all three from the same 
type of curriculum (CPMP or traditional). The teachers formed the groups 
heterogeneously. The groups left their math class to enter a separate room equipped with 
a video camera. The researcher was in the room to welcome the students and give 
directions. The students were told to “think out loud” and talk to each other as they 
solved two math problems. The problems were written on paper, as well as read to the 
students. The students were also told that it was more important how they solved the 
problem than the actual answer to the problem. Students were prompted to think out loud 
if they were silent. 

The problems were: 

1. How many rectangles are there? 



2. How many keystrokes are needed to put page numbers on a book with 124 
pages? 

To select problems, the researchers first assembled a large set of problems judged to require
students to make sense of a novel situation and use some sort of strategy. The strategy
needed in the case of the selected problems involved how to keep track of which
rectangles (or digits) have been counted. All problems were judged to be atypical of
problems from standardized tests but able to assess students’ ability to problem solve.
Then a panel of experts in problem solving was asked to look at the problems and vote
for the two that were indeed nonroutine to most ninth-graders. In addition, the second
problem was one used in the earlier study (author, 2000), so it had previously been given in
paper-and-pencil format.

The test administration was videotaped. To analyze the tests, three researchers 
separately filled out a scoring rubric per group while viewing the videotape. See the 
Appendix. The rubric contained three parts for each of the two problems: a score from 0 
to 5 on correctness, a score from 0 to 4 identifying the last stage ever entered in the 
problem-solving process, and a listing of any strategy used. The three sets of scores (one 
from each researcher) were compared and all differences were discussed until agreement 
was reached on a score. Approximately 80% of the scores matched before discussion, and
100% agreement was reached after discussion. Each group of three students thus had 3 scores
(correctness, stage, strategy) for each of the 2 problems.
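
The pre-discussion agreement figure above can be illustrated with a short sketch. The paper does not say how the 80% was computed, so the definition used here (an item counts as a match only when all three raters gave the same score) is an assumption, and the score lists are hypothetical values invented purely for illustration, not data from the study.

```python
def percent_agreement(ratings_by_rater):
    """Return the share of items on which every rater gave the same score.

    ratings_by_rater is a list of equal-length score lists, one per rater.
    Counting only unanimous items as matches is an assumed definition.
    """
    items = list(zip(*ratings_by_rater))
    matches = sum(1 for scores in items if len(set(scores)) == 1)
    return matches / len(items)


# Hypothetical scores from three raters on five items (not study data).
rater_a = [2, 3, 1, 4, 3]
rater_b = [2, 3, 1, 4, 2]
rater_c = [2, 3, 1, 4, 3]

agreement = percent_agreement([rater_a, rater_b, rater_c])
print(f"Agreement before discussion: {agreement:.0%}")
```

A pairwise definition of agreement, or a chance-corrected statistic, would give different figures; the unanimous-match rule is only the simplest reading of "scores matched."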

Results 

Correctness 

The researchers scored each of the problems from 0 to 5 in terms of correctness of
solution, with 5 being fully correct. See Table 2. Due to the small sample size, it is
difficult to reach statistical significance. To compare the correctness scores, the scores
were averaged and the mean scores for each group (traditional and CPMP) were
compared. There was no significant difference on the first problem. On the second
problem the CPMP students outperformed the traditional students.

Problem   Traditional          CPMP                 p-value
1         3 groups scored 1    1 group scored 1     p = .14
          4 groups scored 2    1 group scored 2
          1 group scored 4     2 groups scored 4
2         5 groups scored 1    1 group scored 3     p = .01**
          1 group scored 2     2 groups scored 4
          1 group scored 3     1 group scored 5
          1 group scored 4

Table 2: Scores ranging from 0 to 5 on Correctness of Problem
** significant at the .01 level
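
One way to read the Table 2 tallies is to expand each one into individual group scores and compare the group means. The sketch below does this in Python; the dictionary layout and function names are illustrative assumptions, and because the paper does not state which significance test produced the reported p-values, no test is reproduced here.

```python
from statistics import mean

# Tallies transcribed from Table 2: {score: number of groups with that score}.
correctness = {
    "Problem 1": {
        "Traditional": {1: 3, 2: 4, 4: 1},
        "CPMP": {1: 1, 2: 1, 4: 2},
    },
    "Problem 2": {
        "Traditional": {1: 5, 2: 1, 3: 1, 4: 1},
        "CPMP": {3: 1, 4: 2, 5: 1},
    },
}


def expand(tally):
    """Turn a {score: count} tally into a flat list of group scores."""
    return [score for score, count in sorted(tally.items()) for _ in range(count)]


def group_means(table):
    """Compute the mean score per problem and per curriculum group."""
    return {
        problem: {group: mean(expand(tally)) for group, tally in groups.items()}
        for problem, groups in table.items()
    }


means = group_means(correctness)
for problem, groups in means.items():
    for group, value in groups.items():
        print(f"{problem}, {group}: mean correctness = {value:.3f}")
```

With these tallies the means are 1.875 (traditional) versus 2.75 (CPMP) on Problem 1, and 1.75 versus 4.0 on Problem 2.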



Stages 



The researchers scored each problem by the latest stage in the problem-solving process
that the group entered (we use Polya’s stages, 1945/1973):

0 = Did not attempt 

1 = Understanding the problem 

2 = Devising a plan 

3 = Carrying out the plan 

4 = Looking back. 

It was not always easy to determine which stage a group of students entered. 
Nevertheless, three researchers independently assigned a stage based on students talking 
about the problem (from the videotape). All differences in ratings were discussed and 
consensus was reached. See Table 3. 

There was no significant difference on the first problem, and the CPMP students
outperformed the traditional students in terms of highest stage entered in the
problem-solving process on the second problem. The researchers make no claim that entering a
later stage should be taken as evidence that students are doing better at problem solving.
However, research shows that students rarely enter the looking back stage (Schoen &
Oehmke, 1980), and we did feel it important to compare the tendency of these two groups
(CPMP and traditional) to enter various stages. Of course, there is no evidence that
CPMP students are more likely to enter the “looking back” stage, as no CPMP group did so.



Problem   Traditional          CPMP                 p-value
1         4 groups scored 2    4 groups scored 3    p = .10
          4 groups scored 3
2         1 group scored 1     4 groups scored 3    p = .02*
          5 groups scored 2
          2 groups scored 3

Table 3: Scores ranging from 0 to 4 on Stage Entered
0 = Did not attempt   1 = Understanding the problem   2 = Devising a plan
3 = Carrying out the plan   4 = Looking back
* significant at the .05 level

Strategies 

The researchers identified the dominant strategy used by each group of students. 
See Table 4. 



Problem   Traditional                    CPMP
1         Brute Force: 8 Groups          Brute Force: 0 Groups
          Pair up: 0 Groups              Pair up: 2 Groups
          Mark off: 0 Groups             Mark off: 2 Groups
2         Brute Force: 6 Groups          Brute Force: 0 Groups
          Separate by digits: 2 Groups   Separate by digits: 3 Groups
          Simpler case: 0 Groups         Simpler case: 1 Group

Table 4: Predominant Strategy Used by Groups



On the first problem, all traditional students literally began counting rectangles
(“brute force”). Two CPMP groups counted all the one-unit rectangles, then all the
two-unit rectangles, then the three-unit rectangles, etc. (“pair up”). In this method, there are 1-unit
rectangles (9 of them), 2- (12 of them), 3- (6 of them), 4- (4 of them), 6- (4 of them), and
9-unit (1 of them) rectangles. The remaining CPMP groups attempted to make marks and
keep track of rectangles that were already counted (“mark off”).
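
The counts by size can be checked with a short enumeration. The figure for the first problem is not reproduced in this text, so the sketch below assumes a 3-by-3 grid of unit squares, which is the layout consistent with the counts listed above; the function name is illustrative.

```python
from collections import Counter


def rectangles_by_area(rows, cols):
    """Count axis-aligned rectangles in a rows x cols grid, grouped by area.

    The area is measured in unit squares. The grid shape is an assumption
    inferred from the counts in the text, since the figure is not available.
    """
    counts = Counter()
    for height in range(1, rows + 1):
        for width in range(1, cols + 1):
            # Number of positions where a height x width rectangle fits.
            positions = (rows - height + 1) * (cols - width + 1)
            counts[height * width] += positions
    return counts


counts = rectangles_by_area(3, 3)
for area in sorted(counts):
    print(f"{area}-unit rectangles: {counts[area]}")
print("total:", sum(counts.values()))
```

Grouping by size in this way is what keeps the count organized: each size is handled once, so no rectangle is counted twice or missed.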

On the second problem, six groups of traditional students attempted to count the
page numbers’ digits beginning with 1 and going through the final page (“brute force”).
Two groups of traditional students and three groups of CPMP students separated the
problem into 1-digit numbers, 2-digit numbers and 3-digit numbers (“separate by
digits”). In this method, there are 9 1-digit numbers (which each contribute 1 keystroke),
90 2-digit numbers (which each contribute 2 keystrokes) and 25 3-digit numbers (which
each contribute 3 keystrokes). So, the answer is 9 x 1 + 90 x 2 + 25 x 3 = 9 + 180 + 75 = 264.
One CPMP group tried a simpler case by considering books with fewer pages.
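
The strategies described for the second problem can be compared directly in code. The following is a minimal sketch with illustrative function names: one function counts digits page by page ("brute force"), the other groups pages by number of digits ("separate by digits"), and calling either on a smaller book mirrors the "simpler case" strategy.

```python
def keystrokes_brute_force(last_page):
    """Count keystrokes by writing out every page number from 1 to last_page."""
    return sum(len(str(page)) for page in range(1, last_page + 1))


def keystrokes_by_digits(last_page):
    """Count keystrokes by grouping pages with the same number of digits."""
    total = 0
    digits = 1
    first = 1  # smallest page number that has `digits` digits
    while first <= last_page:
        last = min(last_page, first * 10 - 1)
        total += (last - first + 1) * digits
        first *= 10
        digits += 1
    return total


# Simpler cases first, then the 124-page book from the problem.
for pages in (9, 12, 99, 124):
    print(pages, keystrokes_brute_force(pages), keystrokes_by_digits(pages))
```

Both functions agree on every case, and for 124 pages each returns 264, matching the hand computation above.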

Rubric 

The rubric itself allowed us to tease apart aspects of problem solving that most 
paper-and-pencil tests do not. Even so, it was difficult to describe various strategies. The 
researchers used Polya’s How to Solve It book (which lists hundreds of strategies) to 
establish common language. 

Conclusions 

The first research question was: how do the measured problem-solving abilities
differ by students’ mathematics curriculum, traditional or NCTM Standards-oriented,
when a non-paper-and-pencil test is used as the source of measurement? To answer this
question only in terms of correctness, there was basically equal performance on the first
item, with both groups (CPMP and traditional) able to solve the problem. On the second
item, the CPMP groups were significantly better than the traditional groups.

In terms of stages entered, neither group (CPMP and traditional) entered the
looking back stage, but the CPMP students consistently entered the third stage (“carrying
out the plan”). Traditional students were in lower stages on the second item.

If the rubric had only measured the previous characteristics, one might conclude 
that the two groups (CPMP and traditional) were almost equal. However, the third 
measurement recording the dominant strategy used allowed some special characteristics 
of the CPMP group to become clear. Traditional students were more likely to adopt a 
straightforward “compute” strategy, and the CPMP students were more likely to be more 
sophisticated in their strategies. It seems that there is some evidence then that a non- 
paper-and-pencil test will detect differences between these two groups of students that a 
paper-and-pencil test will not. On paper-and-pencil tests (especially of a multiple-choice 
format), it is difficult to determine the strategies that students are using. 

Recognizing the obvious limitations of this small-scale study, we note that the CPMP groups
performed as well as or better than the traditional groups on each of these two problems in
terms of correctness and use of a wider range of Polya-type problem-solving behavior. In
addition, at least at times, the CPMP students were more sophisticated in the strategies that
they used, and never less sophisticated than the traditional students.

Further, we suggest that it can be very efficient to score a non-paper-and-pencil 
test. The rubric provided as the appendix worked quite well. A listing of strategies might 
help the scorers. 

This study needs to be replicated with a larger sample. In addition, it would be 
interesting to interview the students as individuals, to remove the grouping (CPMP 
students might have an advantage in groups, since they frequently learn that way). 
APPENDIX: Scoring the Problem Solving
Circle: Traditional CPMP 

Part I: Is the answer correct? If not, list what was wrong. Score from 0 to 5 points. 

Problem 1: 

There are 36 rectangles. 

1-unit rectangles - 9
2-unit rectangles - 12
3-unit rectangles - 6
4-unit rectangles - 4
6-unit rectangles - 4
9-unit rectangles - 1
9 + 12 + 6 + 4 + 4 + 1 = 36 rectangles

Problem 2: 

It takes 264 keystrokes. 

Pages 1-9, one digit, 9 x 1 = 9 keystrokes
Pages 10-99, two digits, 90 x 2 = 180 keystrokes
Pages 100-124, three digits, 25 x 3 = 75 keystrokes
9 + 180 + 75 = 264 keystrokes

Part II: What stages, if any, were entered in the problem-solving process?

Score from 0 to 4. 

0 = Did not attempt 

1 = Understanding the problem 

2 = Devising a plan 

3 = Carrying out the plan 

4 = Looking back 

Problem 1 

Problem 2 

Part III: List any strategy that was used. Identify the dominant strategy. 

Problem 1 



Problem 2 